Maximum mutual information SPLICE transform for seen and unseen conditions

Authors

  • Jasha Droppo
  • Alex Acero
Abstract

SPLICE is a front-end technique for automatic speech recognition systems. It is a non-linear feature space transformation meant to increase recognition accuracy. Our previous work has shown how to train SPLICE to perform speech feature enhancement. This paper evaluates a maximum mutual information (MMI) based discriminative training method for SPLICE. Discriminative techniques tend to excel when the training and testing data are similar, and to degrade performance significantly otherwise. This paper explores both cases in detail using the Aurora 2 corpus. The overall recognition accuracy of the MMI-SPLICE system is slightly better than the Advanced Front End standard from ETSI, and much better than previous SPLICE training algorithms. Most notably, it achieves this without explicitly resorting to the standard techniques of environment modeling, noise modeling or spectral subtraction.

Similar resources

Error-weighted discriminative training for HMM parameter estimation

Optimizing discriminative objectives in HMM parameter training proved to outperform Maximum Likelihood-based parameter estimation in numerous studies. This paper extends the Maximum Mutual Information objective by applying utterance specific weighting factors that are adjusted for minimum sentence error. In addition to that, the paper investigates tuning separate numerator and denominator weigh...

Assessment of the Wavelet Transform for Noise Reduction in Simulated PET Images

Introduction: An efficient method of tomographic imaging in nuclear medicine is positron emission tomography (PET). Compared to SPECT, PET has the advantages of higher levels of sensitivity, spatial resolution and more accurate quantification. However, high noise levels in the image limit its diagnostic utility. Noise removal in nuclear medicine is traditionally based on Fourier decomposition o...

Improvements in linear transform based speaker adaptation

This paper presents three forms of linear transform based speaker adaptation that can give better performance than standard maximum likelihood linear regression (MLLR) adaptation. For unsupervised adaptation, a lattice-based technique is introduced which is compared to MLLR using confidence scores. For supervised adaptation, estimation of the adaptation matrices using the maximum mutual informa...

Combining Feature Space Discriminative Training with Long-Term Spectro-Temporal Features for Noise-Robust Speech Recognition

Discriminative training of feature space using maximum mutual information (fMMI) objective function has been shown to yield remarkable accuracy improvements. For noisy environments, fMMI can be regarded as an effective noise compensation algorithm and can play a significant role for noise robustness. Feature space speaker adaptation techniques such as feature space maximum likelihood linear reg...

Discriminative Linear Transforms for Speaker Adaptation

Linear transform adaptation techniques such as Maximum Likelihood Linear Regression (MLLR) are a popular and effective family of methods for speaker adaptation. MLLR estimates transform parameters for Gaussian means and variances using a maximum likelihood (ML) objective function. This paper discusses the use of an alternative discriminative objective function for linear transform estimation, w...

Publication date: 2005